Cosine similarity
A common Similarity measure, especially for Continuous embedding vectors.
Ranges from (opposite) to (identical direction). It ignores magnitude and only compares direction, which is the main reason it’s preferred over Euclidean distance in high-dimensional spaces where distances concentrate.
Problems
- Not a true metric (violates triangle inequality).
- Unreliable for high-frequency words in embedding spaces — their representations cluster in a narrow cone, making cosine distances near-meaningless. Zhou2022cosine
- The cosine function’s derivative is 0 at 0, so it has poor resolution near 1 — often the most important region of interest. See Michael Trosset’s note (Sec. 1.3 and 2.7).
Cosine distance
Cosine similarity is often converted into a “distance”: mapping . But this inherits the resolution problem above.
Angular distance
An alternative that avoids the resolution issue:
Google’s Universal Sentence Encoder normalizes this into a similarity: